Method and apparatus for processing video signal using affine prediction

ABSTRACT

The present disclosure provides a method for decoding a video signal including a current block based on an affine motion prediction mode (affine mode, AF mode), the method including: checking whether the AF mode is applied to the current block, the AF mode representing a motion prediction mode using an affine motion model; checking whether an AF4 mode is used when the AF mode is applied to the current block, the AF4 mode representing a mode in which a motion vector is predicted using four parameters constituting the affine motion model; generating a motion vector predictor using the four parameters when the AF4 mode is used and generating a motion vector predictor using six parameters constituting the affine motion model when the AF4 mode is not used; and obtaining a motion vector of the current block based on the motion vector predictor.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 17/363,648, filed on Jun. 30, 2021, which is a continuation of U.S. application Ser. No. 16/636,263, filed on Feb. 3, 2020, now U.S. Pat. No. 11,089,317, which is the National Stage filing under 35 U.S.C. 371 of International Application No. PCT/KR2018/008845, filed on Aug. 3, 2018, which claims the benefit of U.S. Provisional Application No. 62/541,083, filed on Aug. 3, 2017, the contents of which are all hereby incorporated by reference herein in their entirety.

TECHNICAL FIELD

The present disclosure relates to a method and apparatus for encoding/decoding a video signal and, more specifically, to a method and apparatus for adaptively performing affine prediction.

BACKGROUND ART

Compression encoding means a series of signal processing techniques for transmitting digitalized information through a communication line or for storing digitalized information in a form appropriate to a storage medium. Media such as video, images, and voice may be targets of compression encoding; in particular, technology that performs compression encoding targeting video is referred to as video compression.

Next generation video contents will have characteristics of a high spatial resolution, a high frame rate, and high dimensionality of scene representation. In order to process such contents, the demands on memory storage, memory access rate, and processing power will remarkably increase.

Therefore, it is necessary to design a coding tool for more efficiently processing next generation video contents.

SUMMARY

The present disclosure proposes a method for encoding and decoding a video signal more efficiently.

In addition, the present disclosure proposes a method for performing encoding or decoding in consideration of both an AF4 mode, which is an affine prediction mode using four parameters, and an AF6 mode, which is an affine prediction mode using six parameters.

Furthermore, the present disclosure proposes a method for adaptively determining (or selecting) an optimal coding mode according to at least one of the AF4 mode and the AF6 mode based on a block size.

Furthermore, the present disclosure proposes a method for adaptively determining (or selecting) an optimal coding mode according to at least one of the AF4 mode and the AF6 mode based on whether a neighbor block has been coded according to affine prediction.

Technical Solution

To solve the aforementioned technical problems, the present disclosure provides a method for adaptively performing affine prediction based on a block size.

Furthermore, the present disclosure provides a method for adaptively performing affine prediction based on whether a neighbor block has been coded according to affine prediction.

Furthermore, the present disclosure provides a method for adaptively determining (or selecting) an optimal coding mode based on at least one of the AF4 mode and the AF6 mode.

Furthermore, the present disclosure provides a method for adaptively performing affine prediction based on whether at least one predetermined condition is satisfied. In this case, the predetermined condition may include at least one of a block size, the number of pixels of a block, a block width, a block height, and whether a neighbor block has been coded according to affine prediction.

The present disclosure can improve the performance of affine prediction by providing a method for adaptively performing affine prediction, and can achieve more efficient coding by reducing the complexity of affine prediction.

DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram illustrating a configuration of an encoder for encoding a video signal according to an embodiment of the present disclosure.

FIG. 2 is a block diagram illustrating a configuration of a decoder for decoding a video signal according to an embodiment of the present disclosure.

FIG. 3 is a diagram for explaining a QT (QuadTree, referred to as ‘QT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 4 is a diagram for explaining a BT (Binary Tree, referred to as ‘BT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 5 is a diagram for explaining a TT (Ternary Tree, referred to as ‘TT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 6 is a diagram for explaining an AT (Asymmetric Tree, referred to as ‘AT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIGS. 7A and 7B are diagrams for explaining an inter-prediction mode as an embodiment to which the present disclosure is applied.

FIG. 8 is a diagram for explaining an affine motion model as an embodiment to which the present disclosure is applied.

FIGS. 9A and 9B are diagrams for explaining an affine motion prediction method using a control point motion vector as an embodiment to which the present disclosure is applied.

FIG. 10 is a flowchart illustrating a process of processing a video signal including a current block using an affine prediction mode as an embodiment to which the present disclosure is applied.

FIG. 11 is a flowchart illustrating a process of adaptively determining an optimal coding mode based on at least one of an AF4 mode and an AF6 mode as an embodiment (1-1) to which the present disclosure is applied.

FIG. 12 is a flowchart illustrating a process of adaptively performing decoding based on the AF4 mode or the AF6 mode as an embodiment (1-2) to which the present disclosure is applied.

FIG. 13 illustrates a syntax structure in which decoding is performed based on the AF4 mode or the AF6 mode as an embodiment (1-3) to which the present disclosure is applied.

FIG. 14 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on condition A as an embodiment (2-1) to which the present disclosure is applied.

FIG. 15 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on condition A as an embodiment (2-2) to which the present disclosure is applied.

FIG. 16 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on condition A as an embodiment (2-3) to which the present disclosure is applied.

FIG. 17 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on at least one of condition B and condition C as an embodiment (3-1) to which the present disclosure is applied.

FIG. 18 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition B and condition C as an embodiment (3-2) to which the present disclosure is applied.

FIG. 19 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition B and condition C as an embodiment (3-3) to which the present disclosure is applied.

FIG. 20 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-1) to which the present disclosure is applied.

FIG. 21 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-2) to which the present disclosure is applied.

FIG. 22 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-3) to which the present disclosure is applied.

FIG. 23 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-1) to which the present disclosure is applied.

FIG. 24 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-2) to which the present disclosure is applied.

FIG. 25 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-3) to which the present disclosure is applied.

FIG. 26 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of a neighbor block as an embodiment (6-1) to which the present disclosure is applied.

FIG. 27 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of a neighbor block as an embodiment (6-2) to which the present disclosure is applied.

FIG. 28 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of a neighbor block as an embodiment (6-3) to which the present disclosure is applied.

FIG. 29 is a flowchart illustrating a process of generating a motion vector predictor based on at least one of the AF4 mode and the AF6 mode as an embodiment to which the present disclosure is applied.

FIG. 30 is a flowchart illustrating a process of generating a motion vector predictor based on an AF4_flag and an AF6_flag as an embodiment to which the present disclosure is applied.

FIG. 31 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on whether a neighbor block has been coded in an AF mode as an embodiment to which the present disclosure is applied.

FIG. 32 illustrates a syntax in which decoding is adaptively performed based on the AF4_flag and the AF6_flag as an embodiment to which the present disclosure is applied.

FIG. 33 illustrates a syntax in which decoding is adaptively performed according to the AF4 mode or the AF6 mode based on whether a neighbor block has been coded in an AF mode as an embodiment to which the present disclosure is applied.

FIG. 34 illustrates a video coding system to which the present disclosure is applied.

FIG. 35 illustrates a content streaming system to which the present disclosure is applied.

BEST MODE

The present disclosure provides a method for decoding a video signal including a current block based on an affine motion prediction mode (affine mode, AF mode), the method including: checking whether the AF mode is applied to the current block, the AF mode representing a motion prediction mode using an affine motion model; checking whether an AF4 mode is used when the AF mode is applied to the current block, the AF4 mode representing a mode in which a motion vector is predicted using four parameters constituting the affine motion model; generating a motion vector predictor using the four parameters when the AF4 mode is used and generating a motion vector predictor using six parameters constituting the affine motion model when the AF4 mode is not used; and obtaining a motion vector of the current block based on the motion vector predictor.

In the present disclosure, the method may further include obtaining an affine flag from the video signal, wherein the affine flag indicates whether the AF mode is applied to the current block, and whether the AF mode is applied to the current block is checked based on the affine flag.

In the present disclosure, the method may further include obtaining an affine parameter flag from the video signal when the AF mode is applied to the current block according to the affine flag, wherein the affine parameter flag indicates whether the motion vector predictor is generated using the four parameters or the six parameters.

In the present disclosure, the affine flag and the affine parameter flag may be defined at at least one level of a slice, a largest coding unit, a coding unit and a prediction unit.

In the present disclosure, the method may further include checking whether the size of the current block satisfies a predetermined condition, wherein the predetermined condition represents whether at least one of the number of pixels in the current block, the width of the current block and the height of the current block is greater than a predetermined threshold value, and the checking of whether the AF mode is applied to the current block is performed when the size of the current block satisfies the predetermined condition.

In the present disclosure, the current block may be decoded based on a coding mode other than the AF mode when the size of the current block does not satisfy the predetermined condition.

In the present disclosure, the method may further include checking whether the AF mode has been applied to a neighbor block when the AF mode is applied to the current block, wherein the motion vector predictor is generated using the four parameters when the AF mode has been applied to the neighbor block, and the checking of whether the AF4 mode is used is performed when the AF mode has not been applied to the neighbor block.

The present disclosure provides an apparatus for decoding a video signal including a current block based on an affine motion prediction mode (AF mode), the apparatus including an inter prediction unit configured to: check whether the AF mode is applied to the current block; check whether an AF4 mode is used when the AF mode is applied to the current block; generate a motion vector predictor using four parameters when the AF4 mode is used and generate a motion vector predictor using six parameters constituting an affine motion model when the AF4 mode is not used; and obtain a motion vector of the current block based on the motion vector predictor, wherein the AF mode represents a motion prediction mode using the affine motion model, and the AF4 mode represents a mode in which a motion vector is predicted using four parameters constituting the affine motion model.

In the present disclosure, the apparatus may further include a parser configured to parse an affine flag from the video signal, wherein the affine flag indicates whether the AF mode is applied to the current block, and whether the AF mode is applied to the current block is checked based on the affine flag.

In the present disclosure, the apparatus may include the parser configured to obtain an affine parameter flag from the video signal when the AF mode is applied to the current block according to the affine flag, wherein the affine parameter flag indicates whether the motion vector predictor is generated using the four parameters or the six parameters.

In the present disclosure, the apparatus may include the inter prediction unit configured to check whether the size of the current block satisfies a predetermined condition, wherein the predetermined condition represents whether at least one of the number of pixels in the current block, the width of the current block and the height of the current block is greater than a predetermined threshold value, and the checking of whether the AF mode is applied to the current block is performed when the size of the current block satisfies the predetermined condition.

In the present disclosure, the apparatus may include the inter prediction unit configured to check whether the AF mode has been applied to a neighbor block when the AF mode is applied to the current block, wherein the motion vector predictor is generated using the four parameters when the AF mode has been applied to the neighbor block, and the checking of whether the AF4 mode is used is performed when the AF mode has not been applied to the neighbor block.

DETAILED DESCRIPTION

Hereinafter, a configuration and operation of an embodiment of the present disclosure will be described in detail with reference to the accompanying drawings. The configuration and operation of the present disclosure described with reference to the drawings are described as an embodiment, and the scope, core configuration, and operation of the present disclosure are not limited thereto.

Further, the terms used in the present disclosure are selected from currently widely used general terms, but in specific cases, terms arbitrarily selected by the applicant are used. In such cases, because the meaning of the term is clearly described in the detailed description of the corresponding portion, the term should not be construed simply by its name as used in the description of the present disclosure, and the meaning of the corresponding term should be comprehended and construed.

Further, when there is a general term selected for describing the disclosure or another term having a similar meaning, the terms used in the present disclosure may be replaced for more appropriate interpretation. For example, in each coding process, a signal, data, a sample, a picture, a frame, and a block may be appropriately replaced and construed. Further, in each coding process, partitioning, decomposition, splitting, and division may be appropriately replaced and construed.

FIG. 1 shows a schematic block diagram of an encoder for encoding a video signal, in accordance with one embodiment of the present disclosure.

Referring to FIG. 1, an encoder 100 may include an image segmentation unit 110, a transform unit 120, a quantization unit 130, a dequantization unit 140, an inverse transform unit 150, a filtering unit 160, a DPB (Decoded Picture Buffer) 170, an inter-prediction unit 180, an intra-prediction unit 185 and an entropy-encoding unit 190.

The image segmentation unit 110 may divide an input image (or, a picture, a frame) input to the encoder 100 into one or more process units. For example, the process unit may be a coding tree unit (CTU), a coding unit (CU), a prediction unit (PU), or a transform unit (TU). Here, segmentation may be performed by at least one of QT (QuadTree), BT (Binary Tree), TT (Ternary Tree) and AT (Asymmetric Tree).

However, the terms are used only for convenience of illustration of the present disclosure, and the present disclosure is not limited to the definitions of the terms. In this specification, for convenience of illustration, the term “coding unit” is employed as a unit used in a process of encoding or decoding a video signal; however, the present disclosure is not limited thereto, and another process unit may be appropriately selected based on the contents of the present disclosure.

The encoder 100 may generate a residual signal by subtracting a prediction signal output from the inter-prediction unit 180 or the intra-prediction unit 185 from the input image signal. The generated residual signal may be transmitted to the transform unit 120.

The transform unit 120 may apply a transform technique to the residual signal to produce a transform coefficient. The transform process may be applied to a square pixel block or to a block of variable size other than a square.

The quantization unit 130 may quantize the transform coefficient and transmit the quantized coefficient to the entropy-encoding unit 190. The entropy-encoding unit 190 may entropy-code the quantized signal and then output the entropy-coded signal as a bitstream.

The quantized signal output from the quantization unit 130 may be used to generate a prediction signal. For example, the quantized signal may be subjected to dequantization and an inverse transform via the dequantization unit 140 and the inverse transform unit 150 in the loop, respectively, to reconstruct a residual signal. The reconstructed residual signal may be added to the prediction signal output from the inter-prediction unit 180 or the intra-prediction unit 185 to generate a reconstructed signal.

Meanwhile, in the compression process, adjacent blocks may be quantized by different quantization parameters, so that deterioration of the block boundary may occur. This phenomenon is called blocking artifacts, and it is one of the important factors for evaluating image quality. A filtering process may be performed to reduce such deterioration. Using the filtering process, the blocking deterioration may be eliminated and, at the same time, an error of the current picture may be reduced, thereby improving image quality.

The filtering unit 160 may apply filtering to the reconstructed signal and then output the filtered reconstructed signal to a reproducing device or the decoded picture buffer 170. The filtered signal transmitted to the decoded picture buffer 170 may be used as a reference picture in the inter-prediction unit 180. In this way, using the filtered picture as the reference picture in the inter-picture prediction mode, not only the picture quality but also the coding efficiency may be improved.

The decoded picture buffer 170 may store the filtered picture for use as the reference picture in the inter-prediction unit 180.

The inter-prediction unit 180 may perform temporal prediction and/or spatial prediction with reference to the reconstructed picture to remove temporal redundancy and/or spatial redundancy. In this case, the reference picture used for the prediction may be a transformed signal obtained via the quantization and dequantization on a block basis in the previous encoding/decoding. Thus, this may result in blocking artifacts or ringing artifacts.

Accordingly, in order to solve the performance degradation due to the discontinuity or quantization of the signal, the inter-prediction unit 180 may interpolate signals between pixels on a subpixel basis using a low-pass filter. In this case, a subpixel means a virtual pixel generated by applying an interpolation filter, and an integer pixel means an actual pixel existing in the reconstructed picture. Interpolation methods may include linear interpolation, bi-linear interpolation, a Wiener filter, and the like.

The interpolation filter is applied to a reconstructed picture and thus can improve the precision of a prediction. For example, the inter-prediction unit 180 may generate an interpolated pixel by applying the interpolation filter to an integer pixel, and may perform a prediction using an interpolated block configured with interpolated pixels as a prediction block.

The intra-prediction unit 185 may predict a current block with reference to samples in the vicinity of a block to be encoded. The intra-prediction unit 185 may perform the following process in order to perform intra prediction. First, the prediction unit may prepare a reference sample necessary to generate a prediction signal. Then, the prediction unit may generate a prediction signal using the prepared reference sample. Thereafter, the prediction unit encodes a prediction mode. In this case, the reference sample may be prepared through reference sample padding and/or reference sample filtering. The reference sample may include a quantization error because a prediction and reconstruction process has been performed on it. Accordingly, in order to reduce such an error, a reference sample filtering process may be performed for each prediction mode used for intra prediction.

The prediction signal generated through the inter-prediction unit 180 or the intra-prediction unit 185 may be used to generate a reconstructed signal or may be used to generate a residual signal.

FIG. 2 is an embodiment to which the present disclosure is applied and shows a schematic block diagram of a decoder by which the decoding of a video signal is performed.

Referring to FIG. 2, the decoder 200 may be configured to include a parsing unit (not shown), an entropy decoding unit 210, a dequantization unit 220, an inverse transform unit 230, a filtering unit 240, a decoded picture buffer (DPB) unit 250, an inter prediction unit 260, an intra prediction unit 265 and a reconstruction unit (not shown).

The decoder 200 may receive a signal output by the encoder 100 of FIG. 1, and may parse or obtain a syntax element through the parsing unit (not shown). The parsed or obtained signal may be entropy-decoded through the entropy decoding unit 210.

The dequantization unit 220 obtains a transform coefficient from the entropy-decoded signal using quantization step size information.

The inverse transform unit 230 obtains a residual signal by inversely transforming the transform coefficient.

The reconstruction unit (not shown) generates a reconstructed signal by adding the obtained residual signal to a prediction signal output by the inter prediction unit 260 or the intra prediction unit 265.

The filtering unit 240 applies filtering to the reconstructed signal and transmits the filtered signal to a playback device or to the decoded picture buffer unit 250. The filtered signal transmitted to the decoded picture buffer unit 250 may be used as a reference picture in the inter prediction unit 260.

In this specification, the embodiments described for the filtering unit 160, the inter-prediction unit 180 and the intra-prediction unit 185 of the encoder 100 may be identically applied to the filtering unit 240, the inter prediction unit 260 and the intra prediction unit 265 of the decoder, respectively.

A reconstructed video signal output through the decoder 200 may be played back through a playback device.

FIG. 3 is a diagram for explaining a QT (QuadTree, referred to as ‘QT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

In video coding, a single block can be segmented based on QT (QuadTree). Further, a single subblock segmented according to QT can be further recursively segmented using QT. A leaf block that is no longer segmented by QT can be segmented according to at least one of BT (Binary Tree), TT (Ternary Tree) and AT (Asymmetric Tree). BT can have two types of segmentation: horizontal BT (2N×N, 2N×N) and vertical BT (N×2N, N×2N). TT can have two types of segmentation: horizontal TT (2N×1/2N, 2N×N, 2N×1/2N) and vertical TT (1/2N×2N, N×2N, 1/2N×2N). AT can have four types of segmentation: horizontal-up AT (2N×1/2N, 2N×3/2N); horizontal-down AT (2N×3/2N, 2N×1/2N); vertical-left AT (1/2N×2N, 3/2N×2N); and vertical-right AT (3/2N×2N, 1/2N×2N). Each of BT, TT and AT may be further recursively segmented using BT, TT and AT.
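For illustration only, the following sketch (not part of the patent text) enumerates the subblock sizes produced by each split type listed above for a block of width w and height h; the mode names are hypothetical labels.

```python
# Hedged sketch: subblock (width, height) pairs produced by one split of a
# 2N x 2N block, following the QT/BT/TT/AT shapes described above.
def split_shapes(w, h, mode):
    if mode == "QT":            # four equal quadrants
        return [(w // 2, h // 2)] * 4
    if mode == "BT_HOR":        # horizontal BT: (2N x N, 2N x N)
        return [(w, h // 2)] * 2
    if mode == "BT_VER":        # vertical BT: (N x 2N, N x 2N)
        return [(w // 2, h)] * 2
    if mode == "TT_HOR":        # horizontal TT: (2N x 1/2N, 2N x N, 2N x 1/2N)
        return [(w, h // 4), (w, h // 2), (w, h // 4)]
    if mode == "TT_VER":        # vertical TT: (1/2N x 2N, N x 2N, 1/2N x 2N)
        return [(w // 4, h), (w // 2, h), (w // 4, h)]
    if mode == "AT_HOR_UP":     # horizontal-up AT: (2N x 1/2N, 2N x 3/2N)
        return [(w, h // 4), (w, 3 * h // 4)]
    if mode == "AT_HOR_DOWN":   # horizontal-down AT: (2N x 3/2N, 2N x 1/2N)
        return [(w, 3 * h // 4), (w, h // 4)]
    if mode == "AT_VER_LEFT":   # vertical-left AT: (1/2N x 2N, 3/2N x 2N)
        return [(w // 4, h), (3 * w // 4, h)]
    if mode == "AT_VER_RIGHT":  # vertical-right AT: (3/2N x 2N, 1/2N x 2N)
        return [(3 * w // 4, h), (w // 4, h)]
    raise ValueError(mode)

print(split_shapes(32, 32, "TT_HOR"))  # [(32, 8), (32, 16), (32, 8)]
```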

FIG. 3 shows an example of QT segmentation. A block A can be segmented into four subblocks A0, A1, A2 and A3 using QT. The subblock A1 can be further segmented into four subblocks B0, B1, B2 and B3 using QT.

FIG. 4 is a diagram for explaining a BT (Binary Tree, referred to as ‘BT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 4 shows an example of BT segmentation. A block B3 that is no longer segmented by QT can be segmented into vertical BT C0 and C1 or horizontal BT D0 and D1. Each subblock, such as the block C0, can be further recursively segmented into horizontal BT E0 and E1 or vertical BT F0 and F1.

FIG. 5 is a diagram for explaining a TT (Ternary Tree, referred to as ‘TT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 5 shows an example of TT segmentation. A block B3 that is no longer segmented by QT can be segmented into vertical TT C0, C1 and C2 or horizontal TT D0, D1 and D2. Each subblock, such as the block C1, can be further recursively segmented into horizontal TT E0, E1 and E2 or vertical TT F0, F1 and F2.

FIG. 6 is a diagram for explaining an AT (Asymmetric Tree, referred to as ‘AT’ hereinafter) block segmentation structure as an embodiment to which the present disclosure is applicable.

FIG. 6 shows an example of AT segmentation. A block B3 that is no longer segmented by QT can be segmented into vertical AT C0 and C1 or horizontal AT D0 and D1. Each subblock, such as the block C1, can be further recursively segmented into horizontal AT E0 and E1 or vertical AT F0 and F1.

Meanwhile, BT, TT and AT segmentations can be used together. For example, a subblock segmented by BT can be segmented by TT or AT. Further, a subblock segmented by TT can be segmented by BT or AT, and a subblock segmented by AT can be segmented by BT or TT. For example, each subblock may be segmented into vertical BT after horizontal BT segmentation, and each subblock may be segmented into horizontal BT after vertical BT segmentation. These two segmentation methods have different segmentation orders, but the finally segmented shapes obtained thereby are identical.

In addition, when a block is segmented, a block search order can be defined in various manners. In general, the search is performed from left to right and from top to bottom. Here, block search may mean the order of determining additional segmentation of each segmented subblock, the encoding order of each subblock when the block is segmented no longer, or the search order when a subblock refers to information on other neighbor blocks.

FIGS. 7A and 7B are diagrams for explaining an inter-prediction mode as an embodiment to which the present disclosure is applied.

Inter-Prediction Mode

In an inter-prediction mode to which the present disclosure is applied, a merge mode, an AMVP (Advanced Motion Vector Prediction) mode or an affine prediction mode (hereinafter referred to as ‘AF mode’) may be used in order to reduce the quantity of motion information.

1) Merge Mode

The merge mode refers to a method of deriving motion parameters (or information) from a spatially or temporally neighboring block.

A set of candidates available in the merge mode includes spatial neighbor candidates, temporal candidates and generated candidates.

Referring to FIG. 7A, whether each spatial candidate block is available is checked in the order of {A1, B1, B0, A0, B2}. Here, when a candidate block is encoded in an intra-prediction mode and thus there is no motion information, or when the candidate block is located outside the current picture (or slice), the candidate block cannot be used.

After determination of the validity of the spatial candidates, spatial merge candidates can be configured by excluding unnecessary candidate blocks from the candidate blocks of the current processing block. For example, when a candidate block of the current predictive block is the first predictive block in the same coding block, the candidate block can be excluded, and candidate blocks having the same motion information can also be excluded.

When the spatial merge candidate configuration is completed, a temporal merge candidate configuration process is performed in the order of {T0, T1}.

In the temporal candidate configuration, if a right bottom block T0 of a collocated block of a reference picture is available, the corresponding block is configured as a temporal merge candidate. The collocated block refers to a block present at a position in the selected reference picture which corresponds to the current processing block. If not, a block T1 located at the center of the collocated block is configured as a temporal merge candidate.

A maximum number of merge candidates may be specified in a slice header. If the number of merge candidates is greater than the maximum number, numbers of spatial candidates and temporal candidates smaller than the maximum number are maintained. Otherwise, the candidates added so far are combined to generate additional merge candidates (i.e., combined bi-predictive merge candidates) until the number of merge candidates reaches the maximum number.
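As a rough illustration of the candidate list construction described above, the following sketch (not part of the patent text) applies the {A1, B1, B0, A0, B2} spatial order, duplicate removal and the {T0, T1} temporal order; the candidate objects and their `is_available`/`motion` attributes are hypothetical.

```python
# Hedged sketch of merge candidate list construction, assuming candidate
# objects with hypothetical is_available and motion attributes.
def build_merge_list(spatial, temporal, max_num):
    """spatial: candidates in {A1, B1, B0, A0, B2} order;
       temporal: candidates in {T0, T1} order."""
    merge_list = []
    for cand in spatial:                          # spatial candidates first
        if cand.is_available and cand.motion not in [c.motion for c in merge_list]:
            merge_list.append(cand)               # skip intra/out-of-picture and duplicates
    for cand in temporal:                         # then one temporal candidate
        if cand.is_available:
            merge_list.append(cand)
            break                                 # T1 is used only when T0 is unavailable
    # combined bi-predictive candidates would be appended here until the
    # list reaches max_num (omitted for brevity)
    return merge_list[:max_num]
```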

An encoder configures a merge candidate list through the above method and performs motion estimation to signal information on a candidate block selected from the merge candidate list to a decoder as a merge index (e.g., merge_idx[x0][y0]). FIG. 7B illustrates a case in which a block B1 is selected from the merge candidate list. In this case, “index 1” can be signaled to the decoder as the merge index.

The decoder configures a merge candidate list in the same way as the encoder and derives motion information on the current block from the motion information of the candidate block corresponding to the merge index received from the encoder in the merge candidate list. In addition, the decoder generates a predictive block with respect to the current processing block based on the derived motion information.

2) AMVP (Advanced Motion Vector Prediction) Mode

The AMVP mode refers to a method of deriving a motion vector prediction value from a neighbor block. Accordingly, horizontal and vertical motion vector differences (MVDs), a reference index and an inter-prediction mode are signaled to a decoder. Horizontal and vertical motion vector values are calculated using the derived motion vector prediction value and a motion vector difference (MVD) provided by an encoder.

That is, the encoder configures a motion vector prediction value candidate list and performs motion estimation to signal a motion reference flag (i.e., candidate block information) (e.g., mvp_lX_flag[x0][y0]) selected from the motion vector prediction value candidate list to the decoder. The decoder configures a motion vector prediction value candidate list in the same way as the encoder and derives the motion vector prediction value of the current processing block using the motion information of the candidate block indicated by the motion reference flag received from the encoder in the motion vector prediction value candidate list. In addition, the decoder obtains the motion vector value of the current processing block using the derived motion vector prediction value and the motion vector difference transmitted from the encoder. Then, the decoder generates a predictive block with respect to the current processing block based on the derived motion information (i.e., motion compensation).
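The core of the AMVP reconstruction above is the addition of the signaled difference to the selected predictor; a minimal sketch (not part of the patent text) follows.

```python
# Hedged sketch: the decoder picks the predictor indicated by the motion
# reference flag and adds the signaled motion vector difference.
def amvp_motion_vector(mvp_list, mvp_flag, mvd):
    mvp = mvp_list[mvp_flag]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# e.g. predictor (3, -1) plus signaled difference (2, 4) gives (5, 3)
assert amvp_motion_vector([(3, -1), (0, 0)], 0, (2, 4)) == (5, 3)
```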

In the case of the AMVP mode, two spatial motion candidates are selected from the five available candidates in FIGS. 7A and 7B. The first spatial motion candidate is selected from a left set {A0, A1} and the second spatial motion candidate is selected from a top set {B0, B1, B2}. Here, motion vectors are scaled when the reference index of a neighbor candidate block is not the same as that of the current predictive block.

If the number of candidates selected as a result of the spatial motion candidate search is two, the candidate configuration is ended. If the number is less than two, temporal motion candidates are added.

The decoder (e.g., an inter-prediction unit) decodes motion parameters with respect to a processing block (e.g., a prediction unit).

For example, when the processing block uses the merge mode, the decoder can decode the merge index signaled from the encoder. Then, the decoder can derive the motion parameters of the current processing block from the motion parameters of the candidate block indicated by the merge index.

Furthermore, when the AMVP mode is applied to the processing block, the decoder can decode the horizontal and vertical motion vector differences (MVDs), the reference index and the inter-prediction mode signaled from the encoder. In addition, the decoder can derive a motion vector prediction value from the motion parameters of the candidate block indicated by the motion reference flag and derive the motion vector value of the current processing block using the motion vector prediction value and the received motion vector differences.

The decoder performs motion compensation with respect to a prediction unit using the decoded motion parameters (or information).

That is, the encoder/decoder performs motion compensation for predicting an image of the current unit from a previously decoded picture using the decoded motion parameters.

3) AF Mode (Affine Mode)

The AF mode refers to a motion prediction mode using an affine motion model and may include at least one of an affine merge mode and an affine inter mode. The affine inter mode may include at least one of an AF4 mode and an AF6 mode. Here, the AF4 mode represents a four parameter affine prediction mode using four parameters and the AF6 mode represents a six parameter affine prediction mode using six parameters.

Although the AF4 mode and the AF6 mode are referred to separately in the present disclosure for convenience, the AF4 mode and the AF6 mode need not be defined as separate prediction modes and can be distinguished from each other according to whether four parameters or six parameters are used.

The AF modes will be described in detail with reference to FIGS. 8 to 10.

FIG. 8 is a diagram for explaining an affine motion model as an embodiment to which the present disclosure is applied.

General image coding techniques use a translation motion model to represent a motion of a coding block. Here, the translation motion model represents a prediction method based on a translated block. That is, motion information of a coding block is represented using a single motion vector. However, pixels may have different optimal motion vectors in an actual coding block. If an optimal motion vector can be determined per pixel or subblock using only a small amount of information, coding efficiency can be improved.

Accordingly, the present disclosure proposes an inter-prediction based image processing method that reflects various motions of an image, as well as a prediction method based on a translated block, in order to improve inter-prediction performance.

In addition, the present disclosure proposes an affine motion prediction method for performing encoding/decoding using an affine motion model. The affine motion model represents a prediction method of deriving a motion vector in units of pixels or subblocks using a control point motion vector. In the description, an affine motion prediction mode using the affine motion model is referred to as an AF mode (affine mode).

Furthermore, the present disclosure provides a method for adaptively performing affine prediction based on a block size.

Furthermore, the present disclosure provides a method for adaptively performing affine prediction based on whether a neighbor block has been coded according to affine prediction.

Moreover, the present disclosure provides a method for adaptively determining (or selecting) an optimal coding mode based on at least one of the AF4 mode and the AF6 mode. Here, the AF4 mode represents a four parameter affine prediction mode using four parameters and the AF6 mode represents a six parameter affine prediction mode using six parameters.

Referring to FIG. 8, various methods can be used to represent distortion of an image as motion information; in particular, the affine motion model can represent the four motions illustrated in FIG. 8.

For example, the affine motion model can model any image distortion including translation of an image, scaling of an image, rotation of an image and shearing of an image.

Although the affine motion model can be represented through various methods, the present disclosure proposes a method for displaying (or identifying) distortion using motion information at specific reference points (or reference pixels/samples) of a block and performing inter-prediction using the same. Here, a reference point may be referred to as a control point (CP) (or a control pixel or a control sample) and a motion vector at the reference point may be referred to as a control point motion vector (CPMV). The degree of distortion that can be represented may depend on the number of control points.

The affine motion model can be represented using six parameters a, b, c, d, e and f as shown in Equation 1 below.

$\begin{cases} v_{x} = a \cdot x + b \cdot y + c \\ v_{y} = d \cdot x + e \cdot y + f \end{cases} \qquad \left[\text{Equation 1}\right]$

Here, (x, y) represents the position of the left top pixel of a coding block. In addition, v_x and v_y represent the components of the motion vector at (x, y).
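Equation 1 can be transcribed directly; the following sketch (not part of the patent text) evaluates the six-parameter model at a pixel position.

```python
# Direct transcription of Equation 1: the six-parameter affine motion model.
def affine6(a, b, c, d, e, f, x, y):
    v_x = a * x + b * y + c
    v_y = d * x + e * y + f
    return v_x, v_y
```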

FIGS. 9A and 9B are diagrams for explaining an affine motion prediction method using a control point motion vector as an embodiment to which the present disclosure is applied.

Referring to FIG. 9A, a left top control point CP₀ 902 (hereinafter referred to as a first control point), a right top control point CP₁ 903 (hereinafter referred to as a second control point), and a left bottom control point CP₂ 904 (hereinafter referred to as a third control point) of a current block 901 may have independent pieces of motion information. These can be represented as CP₀, CP₁ and CP₂. However, this corresponds to one embodiment of the present disclosure and the present disclosure is not limited thereto. For example, control points may be defined in various manners, such as a right bottom control point, a center control point and other control points for positions of subblocks.

In an embodiment of the present disclosure, at least one of the first to third control points may be a pixel included in the current block. Alternatively, at least one of the first to third control points may be a pixel that is not included in the current block but neighbors the current block.

Motion information per pixel or subblock of the current block 901 can be derived using the motion information of one or more of the aforementioned control points.

For example, an affine motion model using the motion vectors of the left top control point 902, the right top control point 903 and the left bottom control point 904 of the current block 901 can be defined as shown in Equation 2 below.

$\begin{cases} v_{x} = \frac{(v_{1x} - v_{0x})}{w} \cdot x + \frac{(v_{2x} - v_{0x})}{h} \cdot y + v_{0x} \\ v_{y} = \frac{(v_{1y} - v_{0y})}{w} \cdot x + \frac{(v_{2y} - v_{0y})}{h} \cdot y + v_{0y} \end{cases} \qquad \left[\text{Equation 2}\right]$

Here, when $\vec{v}_{0}$ represents the motion vector of the left top control point 902, $\vec{v}_{1}$ represents the motion vector of the right top control point 903, and $\vec{v}_{2}$ represents the motion vector of the left bottom control point 904, these motion vectors can be defined as $\vec{v}_{0} = \{v_{0x}, v_{0y}\}$, $\vec{v}_{1} = \{v_{1x}, v_{1y}\}$ and $\vec{v}_{2} = \{v_{2x}, v_{2y}\}$. Further, in Equation 2, w represents the width of the current block 901 and h represents the height of the current block 901. In addition, $\vec{v} = \{v_{x}, v_{y}\}$ represents the motion vector at {x, y}.
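A direct transcription of Equation 2 (not part of the patent text) derives the motion vector at (x, y) from the three control point motion vectors.

```python
# Transcription of Equation 2 for a w x h block with control point motion
# vectors v0 (left top), v1 (right top) and v2 (left bottom).
def affine_mv_6param(v0, v1, v2, w, h, x, y):
    v0x, v0y = v0
    v1x, v1y = v1
    v2x, v2y = v2
    v_x = (v1x - v0x) / w * x + (v2x - v0x) / h * y + v0x
    v_y = (v1y - v0y) / w * x + (v2y - v0y) / h * y + v0y
    return v_x, v_y
```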

The present disclosure can define an affine motion model that represents three motions, translation, scale and rotation, from among the motions that can be represented by affine motion models. This is referred to as a simplified affine motion model or a similarity affine motion model in the description.

The simplified affine motion model can be represented using four parameters a, b, c and d as shown in Equation 3 below.

$\begin{cases} v_{x} = a \cdot x - b \cdot y + c \\ v_{y} = b \cdot x + a \cdot y + d \end{cases} \qquad \left[\text{Equation 3}\right]$

Here, {v_x, v_y} represents the motion vector at {x, y}. The affine motion model using the four parameters may be referred to as AF4. The present disclosure is not limited thereto; a case in which six parameters are used is referred to as AF6, and the above-described embodiments can be equally applied thereto.
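Equation 3 can likewise be transcribed directly (sketch, not part of the patent text):

```python
# Transcription of Equation 3: the simplified (four-parameter) affine model.
def affine4(a, b, c, d, x, y):
    v_x = a * x - b * y + c
    v_y = b * x + a * y + d
    return v_x, v_y
```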

Referring to FIG. 9B, when $\vec{v}_{0}$ represents the motion vector of a left top control point 1001 of the current block and $\vec{v}_{1}$ represents the motion vector of a right top control point 1002, these motion vectors can be defined as $\vec{v}_{0} = \{v_{0x}, v_{0y}\}$ and $\vec{v}_{1} = \{v_{1x}, v_{1y}\}$. Here, the affine motion model of AF4 may be defined as shown in Equation 4 below.

$\begin{cases} v_{x} = \frac{(v_{1x} - v_{0x})}{w} \cdot x - \frac{(v_{1y} - v_{0y})}{w} \cdot y + v_{0x} \\ v_{y} = \frac{(v_{1y} - v_{0y})}{w} \cdot x + \frac{(v_{1x} - v_{0x})}{w} \cdot y + v_{0y} \end{cases} \qquad \left[\text{Equation 4}\right]$

In Equation 4, w represents the width of the current block and h represents the height of the current block. In addition, $\vec{v} = \{v_{x}, v_{y}\}$ represents the motion vector at {x, y}.
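A direct transcription of Equation 4 (not part of the patent text) derives the AF4 motion vector from the two control point motion vectors.

```python
# Transcription of Equation 4: the AF4 model from control point motion
# vectors v0 (left top) and v1 (right top) of a block of width w.
def affine_mv_4param(v0, v1, w, x, y):
    v0x, v0y = v0
    v1x, v1y = v1
    v_x = (v1x - v0x) / w * x - (v1y - v0y) / w * y + v0x
    v_y = (v1y - v0y) / w * x + (v1x - v0x) / w * y + v0y
    return v_x, v_y
```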

An encoder or a decoder can determine (or derive) a motion vector of each pixel position using the control point motion vectors (e.g., the motion vectors of the left top control point 1001 and the right top control point 1002).

In the present disclosure, a set of motion vectors determined through affine motion prediction can be defined as an affine motion vector field. The affine motion vector field can be determined using at least one of Equations 1 to 4.

In an encoding/decoding process, a motion vector obtained through affine motion prediction can be determined in units of pixels or predetermined (or preset) blocks (or subblocks). For example, a motion vector can be derived based on each pixel in a block when a motion vector is determined in units of pixels, and a motion vector can be derived based on each subblock in the current block when a motion vector is determined in units of subblocks. Alternatively, when a motion vector is determined in units of subblocks, the motion vector of a corresponding subblock can be derived based on its left top pixel or center pixel.

Hereinafter, although a case in which a motion vector obtained through affine motion prediction is determined in units of 4×4 blocks will be chiefly described in the present disclosure for convenience of description, the present disclosure is not limited thereto and may be applied in units of pixels or in units of blocks having a different size.

Meanwhile, referring to FIG. 9B, a case in which the size of the current block is 16×16 is assumed. The encoder or decoder can determine motion vectors in units of 4×4 subblocks using the motion vectors of the left top control point 1001 and the right top control point 1002 of the current block. In addition, the motion vector of a subblock can be determined based on the center pixel value of the subblock.

In FIG. 9B, an arrow indicated at the center of each subblock represents a motion vector obtained by the affine motion model.
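The 16×16 example above can be sketched as follows (not part of the patent text), reusing the `affine_mv_4param` transcription of Equation 4 and evaluating the model at each 4×4 subblock center.

```python
# Hedged sketch: one motion vector per 4 x 4 subblock of a 16 x 16 block,
# evaluated at the subblock center pixel position.
def subblock_mv_field(v0, v1, width=16, height=16, sub=4):
    field = {}
    for sy in range(0, height, sub):
        for sx in range(0, width, sub):
            cx, cy = sx + sub / 2, sy + sub / 2   # subblock center
            field[(sx, sy)] = affine_mv_4param(v0, v1, width, cx, cy)
    return field
```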

Affine motion prediction can be used in an affine merge mode (hereinafter referred to as an ‘AF merge mode’) and an affine inter mode (hereinafter referred to as an ‘AF inter mode’). The AF merge mode is a method of deriving two control point motion vectors and encoding or decoding them without decoding a motion vector difference, similarly to the skip mode or the merge mode. The AF inter mode is a method of determining a motion vector predictor and a control point motion vector and then encoding or decoding a control point motion vector difference (CPMVD) corresponding to the difference between the motion vector predictor and the control point motion vector. In this case, two control point motion vector differences are transmitted in the case of the AF4 mode and three control point motion vector differences are transmitted in the case of the AF6 mode.
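In other words, in the AF inter mode each control point motion vector is the sum of its predictor and its signaled difference; a minimal sketch (not part of the patent text) follows.

```python
# Hedged sketch: reconstructing CPMVs in the AF inter mode. Two CPMVDs are
# signaled in the AF4 mode and three in the AF6 mode.
def reconstruct_cpmvs(predictors, cpmvds):
    assert len(predictors) == len(cpmvds) and len(cpmvds) in (2, 3)
    return [(px + dx, py + dy) for (px, py), (dx, dy) in zip(predictors, cpmvds)]
```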

Here, the AF4 mode has the advantage that it can represent a control point motion vector (CPMV) using a small number of bits because the AF4 mode transmits a smaller number of motion vector differences than the AF6 mode, whereas the AF6 mode has the advantage that it can reduce the number of bits for residual coding because the AF6 mode transmits three CPMVDs and thus can generate an excellent predictor.

Therefore, the present disclosure proposes a method of considering both the AF4 mode and the AF6 mode (or considering them simultaneously) in the AF inter mode.

FIG. 10 is a flowchart illustrating a process of processing a video signal including a current block using an affine prediction mode (hereinafter referred to as an ‘AF mode’) as an embodiment to which the present disclosure is applied.

The present disclosure provides a method for processing a video signal including a current block using the AF mode.

First, a video signal processing apparatus may generate a candidate list of motion vector pairs using motion vectors of pixels or blocks neighboring at least two control points of the current block (S1010). Here, the control points may refer to corner pixels of the current block, and the motion vector pairs may include motion vectors of a left top corner pixel and a right top corner pixel of the current block.

In an embodiment, the control points may include at least two of the left top corner pixel, the right top corner pixel, the left bottom corner pixel and the right bottom corner pixel, and the candidate list may include pixels or blocks neighboring the left top corner pixel, the right top corner pixel and the left bottom corner pixel.

In an embodiment, the candidate list may be generated based on motion vectors of a diagonal neighbor pixel A, an upper neighbor pixel B and a left neighbor pixel C of the left top corner pixel, motion vectors of an upper neighbor pixel D and a diagonal neighbor pixel E of the right top corner pixel, and motion vectors of a left neighbor pixel F and a diagonal neighbor pixel G of the left bottom corner pixel.

In an embodiment, the aforementioned method may further include a step of adding an AMVP candidate list to the candidate list when the number of motion vector pairs included in the candidate list is less than 2.

In an embodiment, a control point motion vector of the current block may be determined as a motion vector derived based on the centers of a left subblock and a right subblock in the current block when the current block has a size of N×4, and the control point motion vector of the current block may be determined as a motion vector derived based on the centers of a top subblock and a bottom subblock in the current block when the current block has a size of 4×N.

In an embodiment, when the current block has a size of N×4, a control point motion vector of the left subblock in the current block is determined by the average of a first control point motion vector and a third control point motion vector, and a control point motion vector of the right subblock is determined by the average of a second control point motion vector and a fourth control point motion vector. When the current block has a size of 4×N, a control point motion vector of the top subblock in the current block is determined by the average of the first control point motion vector and the second control point motion vector, and a control point motion vector of the bottom subblock is determined by the average of the third control point motion vector and the fourth control point motion vector.
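A sketch of this N×4 / 4×N rule follows (not part of the patent text; the ordering of the control point motion vectors in the list is an assumption for illustration).

```python
# Hedged sketch: half-block control point motion vectors for narrow blocks,
# assuming cpmv = [first (left top), second (right top),
#                  third (left bottom), fourth (right bottom)].
def narrow_block_cpmv(cpmv, width, height):
    avg = lambda a, b: ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    v0, v1, v2, v3 = cpmv
    if height == 4:   # N x 4: left half averages CP0/CP2, right half CP1/CP3
        return {"left": avg(v0, v2), "right": avg(v1, v3)}
    if width == 4:    # 4 x N: top half averages CP0/CP1, bottom half CP2/CP3
        return {"top": avg(v0, v1), "bottom": avg(v2, v3)}
    return None
```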

In another embodiment, the aforementioned method may signal a prediction mode or flag information indicating whether the AF mode is executed.

In this case, the video signal processing apparatus may receive the prediction mode or the flag information, execute the AF mode according to the prediction mode or the flag information, and derive a motion vector according to the AF mode. Here, the AF mode represents a mode of deriving a motion vector in units of pixels or subblocks using the control point motion vectors of the current block.

Meanwhile, the video signal processing apparatus may determine a final candidate list of a predetermined number of motion vector pairs based on divergence values of the motion vector pairs (S1020). Here, the final candidate list is determined in ascending order of divergence values, and a divergence value refers to a value indicating the similarity of the directions of motion vectors.
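The patent does not give a divergence formula at this point; the following sketch (not part of the patent text) therefore uses an assumed measure purely to illustrate the ascending sort of step S1020.

```python
# Hedged sketch of step S1020: keep the num_final motion vector pairs with
# the smallest divergence. The divergence function below is an assumption,
# not the patent's definition.
def final_candidate_list(mv_pairs, num_final):
    def divergence(pair):
        (v0x, v0y), (v1x, v1y) = pair[0], pair[1]
        return abs(v1x - v0x) + abs(v1y - v0y)   # assumed direction-similarity proxy
    return sorted(mv_pairs, key=divergence)[:num_final]
```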

The video signal processing apparatus may determine a control point motion vector of the current block based on a rate-distortion cost from the final candidate list (S1030).

The video signal processing apparatus may generate a motion vector predictor of the current block based on the control point motion vector (S1040).

FIG. 11 is a flowchart illustrating a process of adaptively determining an optimal coding mode based on at least one of the AF4 mode and the AF6 mode as an embodiment (1-1) to which the present disclosure is applied.

The video signal processing apparatus may perform prediction based on at least one of a skip mode, a merge mode and an inter mode (S1110). Here, the merge mode may include the aforementioned AF merge mode as well as the normal merge mode, and the inter mode may include the aforementioned AF inter mode as well as the normal inter mode.

The video signal processing apparatus may perform motion vector prediction based on at least one of the AF4 mode and the AF6 mode (S1120). Here, step S1110 and step S1120 are not limited to this order.

The video signal processing apparatus may determine an optimal coding mode from among the aforementioned modes by comparing the results of step S1110 and step S1120 (S1130). Here, the results may be compared based on rate-distortion cost.
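Conceptually, step S1130 is a minimum search over rate-distortion costs; a sketch (not part of the patent text, with a hypothetical `rd_cost` helper passed in as a parameter) follows.

```python
# Hedged sketch of the FIG. 11 mode decision: the mode with the smallest
# rate-distortion cost wins. rd_cost is a hypothetical cost function.
def choose_optimal_mode(block, rd_cost, modes=("skip", "merge", "inter", "AF4", "AF6")):
    return min(modes, key=lambda mode: rd_cost(block, mode))   # S1130
```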

Then, the video signal processing apparatus may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 12 is a flowchart illustrating a process of adaptively performing decoding based on the AF4 mode or the AF6 mode as an embodiment (1-2) to which the present disclosure is applied.

A decoder may receive a bitstream (S1210). The bitstream may include information about a coding mode of a current block in a video signal.

The decoder may check whether the coding mode of the current block is an AF mode (S1220). Here, the AF mode refers to an affine motion prediction mode using an affine motion model and may include, for example, at least one of the affine merge mode and the affine inter mode, and the affine inter mode may include at least one of the AF4 mode and the AF6 mode.

Here, step S1220 may be checked by an affine flag indicating whether the AF mode is executed. For example, the affine flag may be represented by affine_flag. When affine_flag=1, this represents that the AF mode is executed on the current block. When affine_flag=0, this represents that the AF mode is not executed on the current block.

When the AF mode is not executed on the current block, the decoder may perform decoding (i.e., motion vector prediction) according to a coding mode other than the AF mode (S1230). For example, the skip mode, the merge mode or the inter mode may be used.

When the AF mode is executed on the current block, the decoder may check whether the AF4 mode is applied to the current block (S1240).

Here, step S1240 may be checked by an affine parameter flag indicating whether the AF4 mode is executed (or whether affine motion prediction is performed using four parameters). For example, the affine parameter flag may be represented by affine_param_flag. When affine_param_flag=0, this represents that motion vector prediction is performed according to the AF4 mode (S1250). When affine_param_flag=1, this represents that motion vector prediction is performed according to the AF6 mode (S1260). However, the present disclosure is not limited thereto.

For example, the affine parameter flag may include at least one of AF4_flag and AF6_flag.

AF4_flag indicates whether the AF4 mode is executed on the current block. The AF4 mode is executed on the current block when AF4_flag=1 and is not executed on the current block when AF4_flag=0. Here, execution of the AF4 mode means execution of motion vector prediction using an affine motion model represented by four parameters.

AF6_flag indicates whether the AF6 mode is executed on the current block. The AF6 mode is executed on the current block when AF6_flag=1 and is not executed on the current block when AF6_flag=0. Here, execution of the AF6 mode means execution of motion vector prediction using an affine motion model represented by six parameters.
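The flag semantics above can be summarized in a small sketch (not part of the patent text):

```python
# Hedged sketch: resolving the prediction mode from affine_flag, AF4_flag
# and AF6_flag as defined in this embodiment.
def select_prediction_mode(affine_flag, af4_flag, af6_flag):
    if not affine_flag:
        return "non-AF mode (e.g., skip, merge or inter)"
    if af4_flag:
        return "AF4: four-parameter affine motion vector prediction"
    if af6_flag:
        return "AF6: six-parameter affine motion vector prediction"
    return "AF mode, parameter count signaled elsewhere"
```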

The affine flag and the affine parameter flag may be defined at at least one level of a slice, a largest coding unit, a coding unit and a prediction unit.

For example, at least one of AF_flag, AF4_flag and AF6_flag may be defined at the slice level and additionally defined at the block level or prediction unit level.

FIG. 13 illustrates a syntax structure in which decoding is performedbased on the AF4 mode or the AF6 mode as an embodiment (1-3) to whichthe present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode isapplied to the current block (S1310).

When the merge mode is not applied to the current block, the decoder mayobtain affine_flag (S1320). Here, affine_flag indicates whether the AFmode is executed.

When affine_flag=1, that is, when the AF mode is executed on the currentblock, the decoder may obtain affine_param_flag (S1330). Here,affine_param_flag indicates whether the AF4 mode is executed (or whetheraffine motion prediction is executed using four parameters).

When affine_param_flag=0, that is, when motion vector prediction isexecuted according to the AF4 mode, the decoder may obtain two motionvector differences of mvd_CP0 and mvd_CP1 (S1340). Here, mvd_CP0indicates a motion vector difference with respect to control point 0 andmvd_CP1 indicates a motion vector difference with respect to controlpoint 1.

When affine_param_flag=1, that is, when motion vector prediction isexecuted according to the AF6 mode, the decoder may obtain three motionvector differences of mvd_CP0, mvd_CP1 and mvd_CP2 (S1350).
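
The parsing flow of FIG. 13 can be summarized in pseudocode. The sketch below is illustrative only and assumes a hypothetical bitstream reader object with read_flag() and read_mvd() helpers; it is not the normative syntax of the disclosure.

    def parse_affine_syntax(reader):
        """Illustrative parse of FIG. 13: merge_flag -> affine_flag ->
        affine_param_flag -> two or three control-point MVDs."""
        if reader.read_flag("merge_flag"):            # S1310
            return {"mode": "merge"}
        if not reader.read_flag("affine_flag"):       # S1320: AF mode off
            return {"mode": "non_affine_inter"}
        if reader.read_flag("affine_param_flag"):     # S1330: 1 means AF6
            cps = ("CP0", "CP1", "CP2")               # S1350: three MVDs
        else:
            cps = ("CP0", "CP1")                      # S1340: two MVDs
        return {"mode": "AF6" if len(cps) == 3 else "AF4",
                "mvd": [reader.read_mvd(cp) for cp in cps]}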

FIG. 14 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on condition A as an embodiment (2-1) to which the present disclosure is applied.

An encoder may execute prediction based on at least one of the skip mode, the merge mode and the inter mode (S1410).

The encoder may check whether condition A is satisfied for the current block in order to determine an optimal coding mode for motion vector prediction (S1420).

Here, condition A may refer to a condition with respect to a block size. For example, embodiments of Table 1 below may be applied.

TABLE 1

  CONDITION A                                   TH1 value
  Example 1: pixNum (= width * height) > TH1    TH1 = 64, 128, 256, 512, 1024, . . .
  Example 2: width > TH1 && height > TH1        TH1 = 4, 8, 16, 32, . . .
  Example 3: width > TH1 ∥ height > TH1         TH1 = 4, 8, 16, 32, . . .

In Example 1 of Table 1, condition A represents whether the number of pixels pixNum of the current block is greater than a threshold value TH1. Here, the threshold value may be 64, 128, 256, 512, 1024, . . . . For example, TH1=64 represents that a block size is 4×16, 8×8 or 16×4 and TH1=128 represents that a block size is 32×4, 16×8, 8×16 or 4×32.

Example 2 represents whether both the width and height of the current block are greater than the threshold value TH1.

Example 3 represents whether the width of the current block is greater than the threshold value TH1 or whether the height of the current block is greater than the threshold value TH1.
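
The three example forms of condition A reduce to a simple predicate on the block dimensions. A minimal sketch, assuming the operator ∥ of Table 1 denotes logical OR:

    def condition_a(width: int, height: int, th1: int, example: int = 1) -> bool:
        """Block-size condition A of Table 1 (illustrative)."""
        if example == 1:                        # pixNum > TH1
            return width * height > th1
        if example == 2:                        # width > TH1 && height > TH1
            return width > th1 and height > th1
        return width > th1 or height > th1      # Example 3: OR of the two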

When condition A is satisfied, the encoder may perform motion vector prediction based on at least one of the AF4 mode and the AF6 mode (S1430).

The encoder may determine an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode by comparing results of steps S1410 and S1430 (S1440).

On the other hand, when condition A is not satisfied, the encoder may determine an optimal coding mode from among modes other than the AF mode (S1440).

Then, the encoder may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 15 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on condition A as an embodiment (2-2) to which the present disclosure is applied.

A decoder may receive a bitstream (S1510). The bitstream may include information about a coding mode of a current block in a video signal.

The decoder may check whether condition A is satisfied for the current block in order to determine an optimal coding mode for motion vector prediction (S1520). Here, condition A may refer to a condition with respect to a block size. For example, embodiments of Table 1 above may be applied.

When condition A is satisfied, the decoder may check whether the coding mode of the current block is the AF mode (S1530). Here, the AF mode refers to an affine motion prediction mode using an affine motion model and the embodiments described in the description may be applied.

Here, step S1530 may be checked by an affine flag indicating whether the AF mode is executed. For example, the affine flag may be represented by affine_flag. When affine_flag=1, this represents that the AF mode is executed on the current block. When affine_flag=0, this represents that the AF mode is not executed on the current block.

When condition A is not satisfied or the AF mode is not executed on the current block, the decoder may perform decoding (i.e., motion vector prediction) according to a coding mode other than the AF mode (S1540). For example, the skip mode, the merge mode or the inter mode may be used.

When the AF mode is executed on the current block, the decoder may check whether the AF4 mode is applied to the current block (S1550).

Here, step S1550 may be checked by an affine parameter flag indicating whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters). For example, the affine parameter flag may be represented by affine_param_flag. When affine_param_flag=0, this represents that motion vector prediction is performed according to the AF4 mode (S1560). When affine_param_flag=1, this represents that motion vector prediction is performed according to the AF6 mode (S1570). However, the present disclosure is not limited thereto.

FIG. 16 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on condition A as an embodiment (2-3) to which the present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode is applied to the current block (S1610).

When the merge mode is not applied to the current block, the decoder may check whether condition A is satisfied (S1620). Here, condition A may refer to a condition with respect to a block size. For example, the embodiments of Table 1 may be applied.

When condition A is satisfied, the decoder may obtain affine_flag (S1620). Here, affine_flag indicates whether the AF mode is executed.

When affine_flag=1, that is, when the AF mode is executed on the current block, the decoder may obtain affine_param_flag (S1630). Here, affine_param_flag indicates whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters).

When affine_param_flag=0, that is, when motion vector prediction is executed according to the AF4 mode, the decoder may obtain two motion vector differences of mvd_CP0 and mvd_CP1 (S1640). Here, mvd_CP0 indicates a motion vector difference with respect to control point 0 and mvd_CP1 indicates a motion vector difference with respect to control point 1.

In addition, when affine_param_flag=1, that is, when motion vector prediction is executed according to the AF6 mode, the decoder may obtain three motion vector differences of mvd_CP0, mvd_CP1 and mvd_CP2 (S1650).

FIG. 17 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode as an embodiment (3-1) to which the present disclosure is applied.

The present disclosure provides a method for adaptively selecting the AF4 mode and the AF6 mode based on the size of the current block.

For example, the AF6 mode transmits one more motion vector difference than the AF4 mode, and thus the AF6 mode is effective for a relatively large block. Accordingly, encoding can be performed in consideration of only the AF4 mode when the size of the current block is less than (or equal to or less than) a predetermined size, and encoding can be performed in consideration of only the AF6 mode when the size of the current block is equal to or greater than (or greater than) the predetermined size.

Meanwhile, for block sizes at which neither the AF4 mode nor the AF6 mode is clearly advantageous, both the AF4 mode and the AF6 mode may be considered and only the optimal mode therebetween can be signaled.

Referring to FIG. 17, an encoder may execute prediction based on at least one of the skip mode, the merge mode and the inter mode (S1710).

The encoder may check whether condition B is satisfied for the current block (S1720). Here, condition B may refer to a condition with respect to a block size. For example, embodiments of Table 2 below may be applied.

TABLE 2

  CONDITION B                                   TH2 value
  Example 1: pixNum (= width * height) < TH2    TH2 = 64, 128, 256, 512, 1024, . . .
  Example 2: width < TH2 && height < TH2        TH2 = 4, 8, 16, 32, . . .
  Example 3: width < TH2 ∥ height < TH2         TH2 = 4, 8, 16, 32, . . .

In Example 1 of Table 2, condition B represents whether the number of pixels pixNum of the current block is less than a threshold value TH2. Here, the threshold value may be 64, 128, 256, 512, 1024, . . . . For example, TH2=64 may represent that a block size is 4×16, 8×8 or 16×4 and TH2=128 may represent that a block size is 32×4, 16×8, 8×16 or 4×32.

Example 2 represents whether both the width and height of the current block are less than the threshold value TH2.

Example 3 represents whether the width of the current block is less than the threshold value TH2 or whether the height of the current block is less than the threshold value TH2.

When condition B is satisfied, the encoder may perform motion vector prediction based on the AF4 mode (S1730).

When condition B is not satisfied, the encoder may check whether condition C is satisfied for the current block (S1740). Here, condition C may refer to a condition with respect to a block size. For example, embodiments of Table 3 below may be applied.

TABLE 3

  CONDITION C                                   TH3 value
  Example 1: pixNum (= width * height) ≥ TH3    TH3 = 64, 128, 256, 512, 1024, . . .
  Example 2: width ≥ TH3 && height ≥ TH3        TH3 = 4, 8, 16, 32, . . .
  Example 3: width ≥ TH3 ∥ height ≥ TH3         TH3 = 4, 8, 16, 32, . . .

In Example 1 of Table 3, condition C represents whether the number of pixels pixNum of the current block is equal to or greater than a threshold value TH3. Here, the threshold value may be 64, 128, 256, 512, 1024, . . . . For example, TH3=64 may represent that a block size is 4×16, 8×8 or 16×4 and TH3=128 may represent that a block size is 32×4, 16×8, 8×16 or 4×32.

Example 2 represents whether both the width and height of the current block are equal to or greater than the threshold value TH3.

Example 3 represents whether the width of the current block is equal to or greater than the threshold value TH3 or whether the height of the current block is equal to or greater than the threshold value TH3.

When condition C is satisfied, the encoder may perform motion vector prediction based on the AF6 mode (S1760).

When condition C is not satisfied, the encoder may perform motion vector prediction based on the AF4 mode and the AF6 mode (S1750).

Meanwhile, in condition B and condition C, the threshold values TH2 and TH3 may be determined such that they satisfy Equation 5 below.

TH2 ≤ TH3  [Equation 5]
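
The branching of FIG. 17 can be sketched as follows, using Example 1 of Tables 2 and 3; this is an illustrative sketch, not the actual rate-distortion search of the encoder.

    def affine_candidates(width: int, height: int, th2: int, th3: int):
        """Affine modes the encoder evaluates per FIG. 17 (illustrative).
        Uses Example 1 of Tables 2 and 3; assumes TH2 <= TH3 (Equation 5)."""
        assert th2 <= th3, "Equation 5: TH2 <= TH3"
        pix = width * height
        if pix < th2:               # condition B: small block -> AF4 only (S1730)
            return ["AF4"]
        if pix >= th3:              # condition C: large block -> AF6 only (S1760)
            return ["AF6"]
        return ["AF4", "AF6"]       # neither clearly better: try both (S1750)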

The encoder may determine an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode by comparing results of steps S1710, S1730, S1750 and S1760 (S1770).

Then, the encoder may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 18 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition B and condition C as an embodiment (3-2) to which the present disclosure is applied.

A decoder may check whether a coding mode of the current block is the AF mode (S1810). Here, the AF mode refers to an affine motion prediction mode using an affine motion model, embodiments described in the description may be applied, and redundant description is omitted.

When the AF mode is executed on the current block, the decoder may check whether condition B is satisfied for the current block (S1820). Here, condition B may refer to a condition with respect to a block size. For example, the embodiments of Table 2 may be applied and redundant description is omitted.

When condition B is satisfied, the decoder may perform motion vector prediction based on the AF4 mode (S1830).

When condition B is not satisfied, the decoder may check whether condition C is satisfied for the current block (S1840). Here, condition C may refer to a condition with respect to a block size. For example, the embodiments of Table 3 may be applied and redundant description is omitted.

Meanwhile, in condition B and condition C, the threshold values TH2 and TH3 may be determined such that they satisfy Equation 5.

When condition C is satisfied, the decoder may perform motion vector prediction based on the AF6 mode (S1860).

When condition C is not satisfied, the decoder may check whether the AF4 mode is applied to the current block (S1850).

Here, step S1850 may be checked by an affine parameter flag indicating whether the AF4 mode is executed (or whether affine motion prediction is performed using four parameters).

For example, the affine parameter flag may be represented by affine_param_flag. When affine_param_flag=0, this may represent that motion vector prediction is performed according to the AF4 mode (S1830). When affine_param_flag=1, this may represent that motion vector prediction is performed according to the AF6 mode (S1860). However, the present disclosure is not limited thereto.

Meanwhile, when the AF mode is not executed on the current block, the decoder may perform decoding (i.e., motion vector prediction) according to a coding mode other than the AF mode (S1870). For example, the skip mode, the merge mode or the inter mode may be used.

FIG. 19 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition B and condition C as an embodiment (3-3) to which the present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode is applied to the current block (S1910).

When the merge mode is not applied to the current block, the decoder may obtain affine_flag (S1920). Here, affine_flag indicates whether the AF mode is executed.

When affine_flag=1, that is, when the AF mode is executed on the current block, the decoder may check whether condition B is satisfied (S1930). Here, condition B may refer to a condition with respect to a block size. For example, the embodiments of Table 2 may be applied.

When condition B is satisfied, the decoder may set affine_param_flag to 0 (S1930). Here, affine_param_flag indicates whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters). affine_param_flag=0 represents that motion vector prediction is performed according to the AF4 mode.

When condition B is not satisfied and condition C is satisfied, the decoder may set affine_param_flag to 1 (S1940). Here, affine_param_flag=1 represents that motion vector prediction is performed according to the AF6 mode.

When neither condition B nor condition C is satisfied, the decoder may obtain affine_param_flag (S1950).

When affine_param_flag=0, the decoder may obtain two motion vector differences of mvd_CP0 and mvd_CP1 (S1960).

When affine_param_flag=1, the decoder may obtain three motion vector differences of mvd_CP0, mvd_CP1 and mvd_CP2 (S1970).
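
In this syntax, the decoder never parses affine_param_flag when the block size already settles the question. A hedged sketch of the inference of FIG. 19, reusing the hypothetical reader helper and the Example 1 size conditions:

    def infer_affine_param_flag(reader, width: int, height: int,
                                th2: int, th3: int) -> int:
        """Derive or parse affine_param_flag per FIG. 19 (illustrative)."""
        pix = width * height
        if pix < th2:               # condition B holds: AF4 implied (S1930)
            return 0
        if pix >= th3:              # condition C holds: AF6 implied (S1940)
            return 1
        return reader.read_flag("affine_param_flag")   # otherwise parse (S1950)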

FIG. 20 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-1) to which the present disclosure is applied.

An encoder may perform prediction based on at least one of the skip mode, the merge mode and the inter mode (S2010).

The encoder may check whether a neighbor block has been coded in the AF mode (S2020). Here, whether the neighbor block has been coded in the AF mode may be represented by isNeighborAffine( ). For example, when isNeighborAffine( )=0, this can indicate that the neighbor block has not been coded in the AF mode. When isNeighborAffine( )=1, this can indicate that the neighbor block has been coded in the AF mode.
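
isNeighborAffine( ) can be thought of as a scan over the causal neighbors of the current block. The sketch below is illustrative; each neighbor is assumed to be a hypothetical descriptor with an is_affine attribute (None where no neighbor exists), and the exact neighbor set is not specified here.

    def is_neighbor_affine(neighbors) -> int:
        """Illustrative reading of isNeighborAffine(): returns 1 if any
        causal neighbor block was coded in the AF mode, 0 otherwise."""
        return int(any(nb is not None and nb.is_affine for nb in neighbors))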

When the neighbor block has not been coded in the AF mode, the encoder may perform motion vector prediction based on the AF4 mode (S2030).

When the neighbor block has been coded in the AF mode, the encoder may perform motion vector prediction based on the AF4 mode and also perform motion vector prediction based on the AF6 mode (S2040).

The encoder may determine an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode by comparing results of steps S2030 and S2040 (S2050).

Then, the encoder may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 21 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-2) to which the present disclosure is applied.

A decoder may receive a bitstream (S2110). The bitstream may include information about a coding mode of a current block in a video signal.

The decoder may check whether a coding mode of the current block is the AF mode (S2120).

When the AF mode is not executed on the current block, the decoder may perform decoding (i.e., motion vector prediction) according to a coding mode other than the AF mode (S2170). For example, the skip mode, the merge mode or the inter mode may be used.

When the AF mode is executed on the current block, the decoder may check whether the neighbor block has been coded in the AF mode (S2130). Here, whether the neighbor block has been coded in the AF mode may be represented by isNeighborAffine( ). For example, when isNeighborAffine( )=0, this can indicate that the neighbor block has not been coded in the AF mode. When isNeighborAffine( )=1, this can indicate that the neighbor block has been coded in the AF mode.

When the neighbor block has not been coded in the AF mode, the decoder may perform motion vector prediction based on the AF4 mode (S2140).

When the neighbor block has been coded in the AF mode, the decoder may check whether the AF4 mode is applied to the current block (S2150).

Here, step S2150 may be checked by an affine parameter flag indicating whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters). For example, the affine parameter flag may be represented by affine_param_flag. When affine_param_flag=0, motion vector prediction is performed according to the AF4 mode (S2140). When affine_param_flag=1, motion vector prediction is performed according to the AF6 mode (S2160).

FIG. 22 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on a coding mode of a neighbor block as an embodiment (4-3) to which the present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode is applied to the current block (S2210).

When the merge mode is not applied to the current block, the decoder may obtain affine_flag (S2220). Here, affine_flag indicates whether the AF mode is executed.

When affine_flag=1, that is, when the AF mode is executed on the current block, the decoder may check whether the neighbor block has been coded in the AF mode (S2230).

When the neighbor block has been coded in the AF mode, the decoder may obtain affine_param_flag (S2230). Here, affine_param_flag indicates whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters).

When the neighbor block has not been coded in the AF mode, the decoder may set affine_param_flag to 0 (S2240).

When affine_param_flag=0, that is, when motion vector prediction is performed according to the AF4 mode, the decoder may obtain two motion vector differences of mvd_CP0 and mvd_CP1 (S2250).

When affine_param_flag=1, that is, when motion vector prediction is performed according to the AF6 mode, the decoder may obtain three motion vector differences of mvd_CP0, mvd_CP1 and mvd_CP2 (S2260).

FIG. 23 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-1) to which the present disclosure is applied.

The present disclosure provides an embodiment that is a combination of the second embodiment and the third embodiment. FIG. 23 illustrates an example in which all of conditions A, B and C are considered, and the conditions may be applied in different orders.

Referring to FIG. 23, an encoder may perform prediction based on at least one of the skip mode, the merge mode and the inter mode (S2310).

The encoder may check whether condition A is satisfied for the current block (S2320). Here, condition A may refer to a condition with respect to a block size and the embodiments of Table 1 above may be applied thereto.

When condition A is not satisfied, the encoder may determine an optimal coding mode from among modes other than the AF mode (S2380).

On the other hand, when condition A is satisfied, the encoder may check whether condition B is satisfied for the current block (S2330). Here, condition B may refer to a condition with respect to a block size and the embodiments of Table 2 above may be applied thereto.

When condition B is satisfied, the encoder may perform motion vector prediction based on the AF4 mode (S2340).

When condition B is not satisfied, the encoder may check whether condition C is satisfied for the current block (S2350). Here, condition C may refer to a condition with respect to a block size and the embodiments of Table 3 may be applied thereto.

When condition C is satisfied, the encoder may perform motion vector prediction based on the AF6 mode (S2370).

When condition C is not satisfied, the encoder may perform motion vector prediction based on the AF4 mode and also perform motion vector prediction based on the AF6 mode (S2360).

Meanwhile, in condition B and condition C, the threshold values TH2 and TH3 may be determined such that they satisfy Equation 5.
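
Combining conditions A, B and C, the encoder-side decision of FIG. 23 can be sketched as follows; this illustrative sketch reuses the condition_a and affine_candidates helpers from the earlier sketches.

    def fig23_candidates(width: int, height: int,
                         th1: int, th2: int, th3: int):
        """Candidate mode set per FIG. 23 (illustrative combination)."""
        modes = ["skip", "merge", "inter"]      # always evaluated (S2310)
        if condition_a(width, height, th1):     # block large enough for affine
            # conditions B and C narrow the affine candidates
            modes += affine_candidates(width, height, th2, th3)
        return modes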

The encoder may determine an optimal coding mode by comparing results of steps S2310, S2340, S2360 and S2370 (S2380).

Then, the encoder may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 24 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-2) to which the present disclosure is applied.

A decoder may check whether condition A is satisfied for the current block (S2410). Here, condition A may refer to a condition with respect to a block size. For example, the embodiments of Table 1 above may be applied thereto.

When condition A is satisfied, the decoder may check whether a coding mode of the current block is the AF mode (S2420). Here, the AF mode refers to an affine motion prediction mode using an affine motion model, embodiments described in the description may be applied, and redundant description is omitted.

When condition A is not satisfied or the AF mode is not executed on the current block, the decoder may perform decoding (i.e., motion vector prediction) according to a coding mode other than the AF mode (S2480). For example, the skip mode, the merge mode or the inter mode may be used.

When the AF mode is executed on the current block, the decoder may check whether condition B is satisfied for the current block (S2430). Here, condition B may refer to a condition with respect to a block size. For example, the embodiments of Table 2 may be applied thereto and redundant description is omitted.

When condition B is satisfied, the decoder may perform motion vector prediction based on the AF4 mode (S2440).

When condition B is not satisfied, the decoder may check whether condition C is satisfied for the current block (S2450). Here, condition C may refer to a condition with respect to a block size. For example, the embodiments of Table 3 may be applied thereto and redundant description is omitted.

Meanwhile, in condition B and condition C, the threshold values TH2 and TH3 may be determined such that they satisfy Equation 5.

When condition C is satisfied, the decoder may perform motion vector prediction based on the AF6 mode (S2470).

When condition C is not satisfied, the decoder may check whether the AF4 mode is applied to the current block (S2460).

Here, step S2460 may be checked by an affine parameter flag indicating whether the AF4 mode is executed (or whether affine motion prediction is performed using four parameters).

For example, the affine parameter flag may be represented by affine_param_flag. When affine_param_flag=0, this may represent that motion vector prediction is performed according to the AF4 mode (S2440). When affine_param_flag=1, this may represent that motion vector prediction is performed according to the AF6 mode (S2470). However, the present disclosure is not limited thereto.

FIG. 25 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition A, condition B and condition C as an embodiment (5-3) to which the present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode is applied to the current block (S2510).

When the merge mode is not applied to the current block, the decoder may check whether condition A is satisfied (S2520). Here, condition A may refer to a condition with respect to a block size. For example, the embodiments of Table 1 above may be applied thereto.

When condition A is satisfied, the decoder may obtain affine_flag (S2520). Here, affine_flag indicates whether the AF mode is executed.

When affine_flag=1, that is, when the AF mode is executed on the current block, the decoder may check whether condition B is satisfied (S2530). Here, condition B may refer to a condition with respect to a block size. For example, the embodiments of Table 2 may be applied thereto.

When condition B is satisfied, the decoder may set affine_param_flag to 0 (S2540). Here, affine_param_flag indicates whether the AF4 mode is executed (or whether affine motion prediction is executed using four parameters). affine_param_flag=0 represents that motion vector prediction is performed according to the AF4 mode.

When condition B is not satisfied and condition C is satisfied, the decoder may set affine_param_flag to 1 (S2550). Here, affine_param_flag=1 represents that motion vector prediction is performed according to the AF6 mode.

When neither condition B nor condition C is satisfied, the decoder may obtain affine_param_flag (S2560).

When affine_param_flag=0, the decoder may obtain two motion vector differences of mvd_CP0 and mvd_CP1 (S2570).

When affine_param_flag=1, the decoder may obtain three motion vector differences of mvd_CP0, mvd_CP1 and mvd_CP2 (S2580).

FIG. 26 is a flowchart illustrating a process of adaptively determining an optimal coding mode from among motion vector prediction modes including the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of the neighbor block as an embodiment (6-1) to which the present disclosure is applied.

An encoder may perform prediction based on at least one of the skip mode, the merge mode and the inter mode (S2610).

The encoder may check whether condition A is satisfied for the current block (S2620). Here, condition A may refer to a condition with respect to a block size and the embodiments of Table 1 above may be applied thereto.

When condition A is not satisfied, the encoder may determine an optimal coding mode from among modes other than the AF mode (S2660).

On the other hand, when condition A is satisfied, the encoder may check whether a neighbor block has been coded in the AF mode (S2630). Here, whether the neighbor block has been coded in the AF mode may be represented by isNeighborAffine( ). For example, when isNeighborAffine( )=0, this can indicate that the neighbor block has not been coded in the AF mode. When isNeighborAffine( )=1, this can indicate that the neighbor block has been coded in the AF mode.

When the neighbor block has not been coded in the AF mode, the encoder may perform motion vector prediction based on the AF4 mode (S2640).

When the neighbor block has been coded in the AF mode, the encoder may perform motion vector prediction based on the AF4 mode and also perform motion vector prediction based on the AF6 mode (S2650).

The encoder may determine an optimal coding mode by comparing results of steps S2610, S2640 and S2650 (S2660).

Then, the encoder may generate a motion vector predictor of the current block based on the optimal coding mode and obtain a motion vector difference by subtracting the motion vector predictor from the motion vector of the current block.

Thereafter, the encoding/decoding processes described in FIGS. 1 and 2 may be equally applied.

FIG. 27 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of a neighbor block as an embodiment (6-2) to which the present disclosure is applied.

A decoder may receive a bitstream (S2710). The bitstream may include information about a coding mode of a current block in a video signal.

The decoder may check whether condition A is satisfied for the current block in order to determine an optimal coding mode for motion vector prediction (S2720). Here, condition A may refer to a condition with respect to a block size. For example, the embodiments of Table 1 above may be applied thereto.

When condition A is satisfied, the decoder may check whether the coding mode of the current block is the AF mode (S2730).

Details described in S2120 to S2170 of FIG. 21 can be applied to the following steps S2730 to S2780 and redundant description is omitted.

FIG. 28 illustrates a syntax structure in which decoding is performed according to the AF4 mode or the AF6 mode based on at least one of condition A and a coding mode of a neighbor block as an embodiment (6-3) to which the present disclosure is applied.

A decoder may obtain merge_flag and check whether the merge mode is applied to the current block (S2810).

When the merge mode is not applied to the current block, the decoder may check whether condition A is satisfied (S2820). Here, condition A may refer to a condition with respect to a block size. For example, the embodiments of Table 1 above may be applied thereto.

When condition A is satisfied, the decoder may obtain affine_flag (S2820). Here, affine_flag indicates whether the AF mode is executed.

Details described in S2230 to S2260 of FIG. 22 can be applied to the following steps S2830 to S2860 and redundant description is omitted.

FIG. 29 is a flowchart illustrating a process of generating a motion vector predictor based on at least one of the AF4 mode and the AF6 mode as an embodiment to which the present disclosure is applied.

A decoder may check whether an AF mode is applied to the current block (S2910). Here, the AF mode represents a motion prediction mode using an affine motion model.

For example, the decoder may acquire an affine flag from a video signal and check whether the AF mode is applied to the current block based on the affine flag.

When the AF mode is applied to the current block, the decoder may check whether the AF4 mode is used (S2920). Here, the AF4 mode represents a mode in which a motion vector is predicted using four parameters constituting the affine motion model.

For example, when the affine flag indicates that the AF mode is applied to the current block, the decoder may obtain an affine parameter flag from the video signal, and the affine parameter flag indicates whether the motion vector predictor is generated using the four parameters or six parameters.

Here, the affine flag and the affine parameter flag may be defined at at least one level of a slice, a largest coding unit, a coding unit and a prediction unit.

The decoder may generate the motion vector predictor using the four parameters when the AF4 mode is used and generate the motion vector predictor using six parameters constituting the affine motion model when the AF4 mode is not used (S2930).
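
For reference, the control-point-based affine motion field commonly associated with 4 and 6 parameters (as in JEM/VVC-style codecs) is sketched below. This is a generic formulation offered for illustration only, not necessarily the exact model of this disclosure; v0, v1 and v2 denote the control-point motion vectors at the top-left, top-right and bottom-left corners of the block.

    def affine_mv(x, y, w, h, v0, v1, v2=None):
        """Motion vector at position (x, y) inside a w x h block.
        4-parameter model when v2 is None, 6-parameter model otherwise.
        Generic JEM/VVC-style formulation, shown for illustration."""
        ax = (v1[0] - v0[0]) / w          # d(vx)/dx
        ay = (v1[1] - v0[1]) / w          # d(vy)/dx
        if v2 is None:                    # AF4: vertical gradients implied
            bx, by = -ay, ax              # rotation/zoom constraint
        else:                             # AF6: independent vertical gradients
            bx = (v2[0] - v0[0]) / h      # d(vx)/dy
            by = (v2[1] - v0[1]) / h      # d(vy)/dy
        return (v0[0] + ax * x + bx * y,  # vx
                v0[1] + ay * x + by * y)  # vy

In the 4-parameter case the vertical gradients are derived from the horizontal ones, which restricts the model to translation, rotation and zoom; the 6-parameter case adds independent vertical gradients from the third control point.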

The decoder may obtain a motion vector of the current block based on the motion vector predictor (S2940).

In an embodiment, the decoder may check whether the size of the current block satisfies a predetermined condition. Here, the predetermined condition represents whether at least one of the number of pixels in the current block, the width and/or the height of the current block is greater than a predetermined threshold value.

For example, when the size of the current block satisfies the predetermined condition, the decoder may check whether the AF mode is applied to the current block.

On the other hand, when the size of the current block does not satisfy the predetermined condition, the current block may be decoded based on a coding mode other than the AF mode.

In an embodiment, the decoder may check whether the AF mode has been applied to a neighbor block when the AF mode is applied to the current block.

The motion vector predictor is generated using the four parameters when the AF mode has not been applied to the neighbor block, and the decoder may perform the step of checking whether the AF4 mode is used when the AF mode has been applied to the neighbor block.

FIG. 30 is a flowchart illustrating a process of generating a motion vector predictor based on AF4_flag and AF6_flag as an embodiment to which the present disclosure is applied.

A decoder may obtain at least one of AF4_flag and AF6_flag from a video signal (S3010). Here, AF4_flag indicates whether the AF4 mode is executed on the current block and AF6_flag indicates whether the AF6 mode is executed on the current block.

Here, at least one of AF4_flag and AF6_flag may be defined at a slice level and additionally defined at a block level or a prediction unit level. However, the present disclosure is not limited thereto, and at least one of AF4_flag and AF6_flag may be defined at at least one level of a slice, a largest coding unit, a coding unit and a prediction unit.

The decoder may check values of AF4_flag and AF6_flag (S3020).

The AF4 mode is executed on the current block when AF4_flag=1 and the AF4 mode is not executed on the current block when AF4_flag=0. Here, execution of the AF4 mode means execution of motion vector prediction using an affine motion model represented by four parameters.

The AF6 mode is executed on the current block when AF6_flag=1 and the AF6 mode is not executed on the current block when AF6_flag=0. Here, execution of the AF6 mode means execution of motion vector prediction using an affine motion model represented by six parameters.

When AF4_flag=0 and AF6_flag=0, the decoder may perform motion vector prediction according to a mode other than the AF4 mode and the AF6 mode (S3030).

When AF4_flag=1 and AF6_flag=0, the decoder may perform motion vector prediction according to the AF4 mode (S3040).

When AF4_flag=0 and AF6_flag=1, the decoder may perform motion vector prediction according to the AF6 mode (S3050).

When AF4_flag=1 and AF6_flag=1, the decoder may perform motion vector prediction according to the AF4 mode or the AF6 mode (S3060).
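
The four flag combinations of FIG. 30 amount to a small dispatch table. An illustrative sketch:

    def af_mode_from_flags(af4_flag: int, af6_flag: int) -> str:
        """Mode selection per FIG. 30 (illustrative dispatch table)."""
        return {
            (0, 0): "non-affine",      # S3030: neither affine mode enabled
            (1, 0): "AF4",             # S3040
            (0, 1): "AF6",             # S3050
            (1, 1): "AF4 or AF6",      # S3060: resolved by affine_param_flag
        }[(af4_flag, af6_flag)]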

FIG. 31 is a flowchart illustrating a process of adaptively performing decoding according to the AF4 mode or the AF6 mode based on whether a neighbor block has been coded in an AF mode as an embodiment to which the present disclosure is applied.

A decoder may check whether the AF mode is applied to the current block (S3110).

The decoder may check whether a neighbor block has been coded in the AF mode when the AF mode is applied to the current block (S3120).

The decoder may obtain at least one of AF4_flag and AF6_flag when the neighbor block has been coded in the AF mode (S3130).

The decoder may generate a motion vector predictor using four parameters or six parameters based on at least one of AF4_flag and AF6_flag (S3140). For example, the decoder may perform motion vector prediction according to the AF4 mode when AF4_flag=1 and may perform motion vector prediction according to the AF6 mode when AF6_flag=1.

The decoder may obtain a motion vector of the current block based on the motion vector predictor (S3150).

FIG. 32 illustrates a syntax in which decoding is adaptively performed based on AF4_flag and AF6_flag as an embodiment to which the present disclosure is applied.

A decoder may obtain AF4_flag and AF6_flag at a slice level (S3210). Here, AF4_flag indicates whether the AF4 mode is executed on the current block and AF6_flag indicates whether the AF6 mode is executed on the current block. AF4_flag may be represented by affine_4_flag and AF6_flag may be represented by affine_6_flag.

The decoder may adaptively perform decoding based on AF4_flag and AF6_flag at a block level or a prediction unit level.

When affine_4_flag is not 0 or affine_6_flag is not 0 (that is, in cases other than affine_4_flag=0 && affine_6_flag=0), the decoder may obtain an affine flag (S3220). The affine flag can indicate whether the AF mode is executed.

When the AF mode is executed, the decoder may adaptively perform decoding according to the values of AF4_flag and AF6_flag.

When affine_4_flag=1 && affine_6_flag=0, the decoder may set affine_param_flag to 0. That is, affine_param_flag=0 represents that the AF4 mode is executed.

When affine_4_flag=0 && affine_6_flag=1, the decoder may set affine_param_flag to 1. That is, affine_param_flag=1 represents that the AF6 mode is executed.

When affine_4_flag=1 && affine_6_flag=1, the decoder may parse or obtain affine_param_flag. Here, the decoder may perform decoding in the AF4 mode or the AF6 mode according to the value of affine_param_flag at a block level or a prediction unit level.
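
The slice-level gating of FIG. 32 can be sketched as follows. This is illustrative only; reader.read_flag is a hypothetical parsing helper, and the function assumes the AF mode has already been signaled (so the two flags are not both zero).

    def slice_level_param_flag(reader, affine_4_flag: int,
                               affine_6_flag: int) -> int:
        """Infer or parse affine_param_flag per FIG. 32 (illustrative)."""
        if affine_4_flag == 1 and affine_6_flag == 0:
            return 0                                   # AF4 implied
        if affine_4_flag == 0 and affine_6_flag == 1:
            return 1                                   # AF6 implied
        return reader.read_flag("affine_param_flag")   # both enabled: parse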

The above-described embodiments may be applied to other syntax structures and redundant description is omitted.

FIG. 33 illustrates a syntax in which decoding is adaptively performed according to the AF4 mode or the AF6 mode based on whether a neighbor block has been coded in an AF mode as an embodiment to which the present disclosure is applied.

In the present embodiment, the above description can be applied to the parts that FIGS. 32 and 33 have in common, and only the different parts are described.

When affine_4_flag=1 && affine_6_flag=1, the decoder may check whether a neighbor block has been coded in the AF mode.

When the neighbor block has been coded in the AF mode, the decoder may parse or obtain affine_param_flag (S3310). Here, the decoder may perform decoding in the AF4 mode or the AF6 mode according to the value of affine_param_flag at a block level or a prediction unit level.

On the other hand, when the neighbor block has not been coded in the AF mode, the decoder may set affine_param_flag to 0. That is, affine_param_flag=0 represents that the AF4 mode is executed.

FIG. 34 illustrates a video coding system to which the present disclosure is applied.

The video coding system may include a source device and a receiving device. The source device may transmit encoded video/image information or data in a file or streaming format to the receiving device through a digital storage medium or a network.

The source device may include a video source, an encoding apparatus, and a transmitter. The receiving device may include a receiver, a decoding apparatus, and a renderer. The encoding apparatus may be called a video/image encoding apparatus and the decoding apparatus may be called a video/image decoding apparatus. The transmitter may be included in the encoding apparatus. The receiver may be included in the decoding apparatus. The renderer may include a display and the display may be configured in the form of a separate device or an external component.

The video source may obtain a video/image through video/image capture, combination, generation, or the like. The video source may include a video/image capture device and/or a video/image generation device. The video/image capture device may include one or more cameras, a video/image archive including previously captured video/images, and the like, for example. The video/image generation device may include a computer, a tablet, a smartphone, and the like, for example, and (electronically) generate a video/image. For example, a virtual video/image can be generated through a computer or the like, and in this case, a process of generating related data may be replaced by a video/image capture process.

The encoding apparatus can encode a video/image. The encoding apparatus can perform a series of procedures such as prediction, transformation and quantization for compression and coding efficiency. Encoded data (encoded video/image information) may be output in the form of a bitstream.

The transmitter may transmit encoded video/image information or data output in the form of a bitstream to the receiver of the receiving device in a file or streaming format through a digital storage medium or a network. The digital storage medium may include various storage media such as a USB, an SD, a CD, a DVD, Blu-ray, an HDD, and an SSD. The transmitter may include an element for generating a media file through a predetermined file format and an element for transmission through a broadcast/communication network. The receiver may extract the bitstream and transmit the bitstream to the decoding apparatus.

The decoding apparatus can decode a video/image by performing a series of procedures such as dequantization, inverse transformation and prediction corresponding to operation of the encoding apparatus.

The renderer can render the decoded video/image. The rendered video/image may be displayed through a display.

FIG. 35 illustrates a content streaming system to which the present disclosure is applied.

Referring to FIG. 35, the content streaming system to which the present disclosure is applied may include an encoding server, a streaming server, a web server, a media storage, a user equipment, and multimedia input devices.

The encoding server serves to compress content input from multimedia input devices such as a smartphone, a camera and a camcorder into digital content to generate bitstreams and transmit the bitstreams to the streaming server. As another example, when the multimedia input devices such as a smartphone, a camera and a camcorder directly generate bitstreams, the encoding server may be omitted.

The bitstreams may be generated through an encoding method or a bitstream generation method to which the present disclosure is applied, and the streaming server may temporarily store the bitstreams in the process of transmitting or receiving the bitstreams.

The streaming server serves to transmit multimedia data to the user equipment based on a user request through the web server, and the web server serves as a medium that informs a user of available services. When the user requests a desired service from the web server, the web server transmits the request to the streaming server and the streaming server transmits multimedia data to the user. Here, the content streaming system may include an additional control server. In this case, the control server serves to control commands/responses between devices in the content streaming system.

The streaming server may receive content from the media storage and/or the encoding server. For example, when content is received from the encoding server, the streaming server can receive the content in real time. In this case, the streaming server can store bitstreams for a predetermined time in order to provide a smooth streaming service.

Examples of the user equipment may include cellular phones, smartphones, laptop computers, digital broadcast terminals, PDAs (personal digital assistants), PMPs (portable multimedia players), navigation systems, slate PCs, tablet PCs, ultrabooks, wearable devices (e.g., a smartwatch, smart glasses, and a head mounted display (HMD)), digital TVs, desktop computers, digital signage, and the like.

The servers in the content streaming system may be operated as distributed servers. In this case, data received by the servers can be processed in a distributed manner.

As described above, the embodiments described in the present disclosure may be implemented and executed on a processor, microprocessor, controller or chip. For example, the function units shown in FIGS. 1, 2, 34 and 35 may be implemented and executed on a computer, processor, microprocessor, controller or chip.

Furthermore, the decoder and the encoder to which the present disclosure is applied may be included in a multimedia broadcasting transmission/reception apparatus, a mobile communication terminal, a home cinema video apparatus, a digital cinema video apparatus, a surveillance camera, a video chatting apparatus, a real-time communication apparatus such as video communication, a mobile streaming apparatus, a storage medium, a camcorder, a VoD service providing apparatus, an Internet streaming service providing apparatus, a three-dimensional (3D) video apparatus, a teleconference video apparatus, and a medical video apparatus, and may be used to process video signals and data signals.

Furthermore, the decoding/encoding method to which the present disclosure is applied may be produced in the form of a program that is to be executed by a computer and may be stored in a computer-readable recording medium. Multimedia data having a data structure according to the present disclosure may also be stored in computer-readable recording media. The computer-readable recording media include all types of storage devices in which data readable by a computer system is stored. The computer-readable recording media may include a BD, a USB, ROM, RAM, CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device, for example. Furthermore, the computer-readable recording media include media implemented in the form of carrier waves, e.g., transmission through the Internet. Furthermore, a bitstream generated by the encoding method may be stored in a computer-readable recording medium or may be transmitted over wired/wireless communication networks.

INDUSTRIAL APPLICABILITY

The exemplary embodiments of the present disclosure have been disclosed for illustrative purposes, and those skilled in the art may improve, change, replace, or add various other embodiments within the technical spirit and scope of the present disclosure disclosed in the attached claims.

What is claimed is:
1. An apparatus for decoding a video signal including a current block based on an affine motion prediction mode (AF mode), comprising: a processor configured to obtain a merge flag from the video signal, wherein the merge flag represents whether merge mode is applied to the current block, based on that the merge mode not applied to the current block, obtain an affine flag from the video signal based on that a width and a height of the current block is equal to or larger than 16, wherein the affine flag represents whether the AF mode is applied to the current block, and the AF mode represents a motion prediction mode using an affine motion model, obtain an affine parameter flag representing whether 4 parameters or 6 parameters are used for the affine motion model based on that the AF mode is applied to the current block, obtain a motion vector predictor based on the 4 parameters or the 6 parameters being used for the affine motion model, obtain prediction samples for the current block based on the motion vector predictor, obtain residual samples for the current block, reconstruct the current block based on the prediction samples and the residual samples, and filter the reconstructed current block, wherein the affine flag and the affine parameter flag are obtained based on that the merge flag represents the merge mode is not applied to the current block.

2. The apparatus of claim 1, wherein the affine flag and the affine parameter flag are defined in a level of a coding unit.

3. The apparatus of claim 1, wherein the current block is decoded based on a coding mode other than the AF mode based on that the width and the height of the current block is smaller than the predetermined value.

4. An apparatus for encoding a video signal including a current block based on an affine motion prediction mode (AF mode), the apparatus comprising: a processor configured to generate a merge flag representing whether merge mode is applied to the current block, based on that the merge mode not applied to the current block, generate an affine flag based on that a width and a height of the current block is equal to or larger than 16, wherein the affine flag represents whether the AF mode is applied to the current block, and the AF mode represents a motion prediction mode using an affine motion model, generate an affine parameter flag representing whether 4 parameters or 6 parameters are used for the affine motion model based on that the AF mode is applied to the current block, obtain a motion vector predictor based on the 4 parameters or the 6 parameters being used for the affine motion model, generate prediction samples for the current block based on the motion vector predictor, generate residual samples for the current block based on the prediction samples, and perform a transform, a quantization and entropy-encoding for the residual samples, wherein the current block is reconstructed based on the prediction samples and the residual samples, wherein the affine flag and the affine parameter flag are generated based on that the merge flag represents the merge mode is not applied to the current block.

5. The apparatus of claim 4, wherein the affine flag and the affine parameter flag are defined in a level of a coding unit.

6. The apparatus of claim 4, wherein the current block is encoded based on a coding mode other than the AF mode based on that the width and the height of the current block is smaller than the predetermined value.

7. A non-transitory computer-readable storage medium storing encoded picture information generated by performing the steps of: generating a merge flag representing whether merge mode is applied to the current block, based on that the merge mode not applied to the current block, generating an affine flag based on that a width and a height of the current block is equal to or larger than 16, wherein the affine flag represents whether an AF mode is applied to the current block, and the AF mode represents a motion prediction mode using an affine motion model, generating an affine parameter flag representing whether 4 parameters or 6 parameters are used for the affine motion model based on that the AF mode is applied to the current block, obtaining a motion vector predictor based on the 4 parameters or the 6 parameters being used for the affine motion model, generating prediction samples for the current block based on the motion vector predictor, generating residual samples for the current block based on the prediction samples, and performing a transform, a quantization and entropy-encoding for the residual samples, wherein the current block is reconstructed based on the prediction samples and the residual samples, wherein the affine flag and the affine parameter flag are generated based on that the merge flag represents the merge mode is not applied to the current block.

8. A method of transmitting a bitstream generated by performing the steps of: generating a merge flag representing whether merge mode is applied to the current block, based on that the merge mode not applied to the current block, generating an affine flag based on that a width and a height of the current block is equal to or larger than 16, wherein the affine flag represents whether an AF mode is applied to the current block, and the AF mode represents a motion prediction mode using an affine motion model, generating an affine parameter flag representing whether 4 parameters or 6 parameters are used for the affine motion model based on that the AF mode is applied to the current block, obtaining a motion vector predictor based on the 4 parameters or the 6 parameters being used for the affine motion model, generating prediction samples for the current block based on the motion vector predictor, generating residual samples for the current block based on the prediction samples, performing a transform, a quantization and entropy-encoding for the residual samples, and transmitting the bitstream including the entropy-encoded residual samples, wherein the current block is reconstructed based on the prediction samples and the residual samples, wherein the affine flag and the affine parameter flag are generated based on that the merge flag represents the merge mode is not applied to the current block.